Improving Analysis of Data Mining by Creating Dataset Using SQL Aggregations
Abstract
In data mining, an important goal is to prepare data that can be analyzed efficiently. Efficiency and scalability have always been important concerns in the field, and the increasing complexity of mining tasks calls for algorithms that are inherently more expensive. To analyze data efficiently, data mining systems widely use data sets with columns in a horizontal tabular layout. Preparing a data set is one of the more complex tasks in a data mining project: it requires many SQL queries, joining tables and aggregating columns. Conventional RDBMSs usually manage tables in vertical form. A horizontal aggregation returns a set of numbers for each row, as aggregated columns in a horizontal tabular layout, instead of one number per row. The system uses one parent table and several child tables, and operations are then performed on the data loaded from these tables. The PIVOT operator offered by the RDBMS is used to compute the aggregations; the PIVOT method is much faster and scales better. Partitioning the large data set produced by horizontal aggregation into homogeneous clusters is an important task in this system, and the K-means algorithm implemented in SQL is best suited for this operation.
Index terms: PIVOT, SQL, Data Mining, Aggregation
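The horizontal layout described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the `store` (parent) and `sales` (child) tables, their columns and their rows are hypothetical, and because SQLite has no PIVOT operator, the sketch swaps in the equivalent `SUM(CASE ...)` formulation, producing one aggregated column per distinct value of `day`.

```python
import sqlite3

# Hypothetical parent/child tables, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store (store_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE sales (store_id INTEGER, day TEXT, amount REAL);
INSERT INTO store VALUES (1, 'Pune'), (2, 'Delhi');
INSERT INTO sales VALUES
  (1, 'Mon', 10.0), (1, 'Mon', 5.0), (1, 'Tue', 7.0),
  (2, 'Mon', 3.0),  (2, 'Wed', 8.0);
""")

# Step 1: find the distinct values that become output columns.
days = [row[0] for row in
        conn.execute("SELECT DISTINCT day FROM sales ORDER BY day")]

# Step 2: generate one SUM(CASE ...) expression per value.
# The compared value is bound as a parameter; the alias is built from
# the value, which is acceptable here only because the data is trusted.
cols = ", ".join(
    f'SUM(CASE WHEN s.day = ? THEN s.amount ELSE 0 END) AS "amount_{d}"'
    for d in days
)

# Step 3: join parent and child, group by the parent key, and return
# one row per store with one aggregated column per day.
query = f"""
SELECT st.store_id, st.city, {cols}
FROM store AS st
JOIN sales AS s ON s.store_id = st.store_id
GROUP BY st.store_id, st.city
ORDER BY st.store_id
"""
cursor = conn.execute(query, days)
header = [c[0] for c in cursor.description]
rows = cursor.fetchall()

print(header)
for row in rows:
    print(row)
```

A vertical aggregation (`GROUP BY store_id, day`) would return up to one row per store and day; the query above instead returns one row per store, which is the shape most mining algorithms, including K-means, expect as input. On an RDBMS that provides PIVOT, the generated `CASE` expressions would be replaced by that operator.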
Similar resources
Preparing Data Sets for the Data Mining Analysis using the Most Efficient Horizontal Aggregation Method in SQL
A huge amount of time is needed to build the data set for data mining analysis, because data mining practitioners are required to write complex SQL queries and many tables have to be joined to get the aggregated result. Traditional SQL aggregations prepare data sets in a vertical layout; that is, they return the result in one column per aggregated group. But for the data mining project, the dat...
Prepare and Optimize Data Sets for Data Mining Analysis
Preparing a data set for analysis is usually the most tedious task in a data mining project, requiring many complex SQL queries, joining tables and aggregating columns. Existing SQL aggregations have limitations for preparing data sets because they return one column per aggregated group. In general, significant manual effort is required to build data sets, where a horizontal la...
Horizontal Aggregations in SQL to Prepare Data Sets for Data Mining Analysis
Data mining is a widely used domain for extracting trends or patterns from historical data. However, the databases used by enterprises cannot be used directly for data mining: data sets have to be prepared from the real-world database to make them suitable for particular data mining operations. Preparing data sets for analysis is a tedious task, as it involves many aggregat...
Multi Dimensionalised Aggregation in Horizontal Dataset Using Analysis Services
Projecting data in different dimensions is the core concept taken for this project. Preparing a data set for analysis is generally the most time consuming task in a data mining project. In the existing system they used simple, yet powerful, methods to generate SQL (Structured Query Language) code to return aggregated columns in a horizontal tabular layout, returning a set of numbers instead of ...
A Better Approach for Horizontal Aggregations in SQL Using Data Sets for Data Mining Analysis
To analyze data efficiently, data mining systems widely use data sets with columns in a horizontal tabular layout. Generally, preparing a data set is the more complex task in a data mining project, requiring many complex SQL queries, aggregating columns and joining tables. Conventional RDBMSs usually manage tables in vertical form. Aggregated columns in a horizontal tabular layout retu...